Rank | Count | Beginning |
---|---|---|
22601 | 8484 | El |
54515 | 6910 | La |
35487 | 4207 | En |
30252 | 3318 | Els |
145 | 3311 | A |
49458 | 3008 | I |
75852 | 2604 | Per |
41257 | 2368 | ES |
70978 | 1931 | No |
77965 | 1683 | Però |
64222 | 1445 | Les |
94710 | 1415 | Un |
85441 | 1235 | Segons |
94739 | 1126 | Una |
17916 | 1083 | De |
90778 | 1042 | També |
88111 | 1012 | Si |
6024 | 951 | Amb |
8972 | 902 | Aquest |
93003 | 804 | Tot |
8986 | 794 | Aquesta |
3433 | 765 | Al |
97608 | 749 | Va |
15088 | 638 | Com |
19998 | 604 | Després |
2716 | 598 | Això |
48358 | 584 | Hi |
1821 | 572 | Així |
19366 | 567 | Des |
11327 | 482 | Ara |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
SELECT substring_index(sentence, ' ', 1) AS beg, count(*) AS cnt
FROM sentences
GROUP BY substring_index(sentence, ' ', 1)
ORDER BY cnt DESC
LIMIT 50;
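The same count can be sketched outside the database. The following Python snippet is a minimal illustration, not the code used to produce the table: the function name `sentence_beginnings` and its parameters are hypothetical, and it assumes the sentences are available as a list of strings with words separated by single spaces, as in the query above.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words."""
    counts = Counter()
    for sentence in sentences:
        # Split on single spaces, as substring_index(sentence, ' ', n) does.
        words = sentence.split(" ")
        # A sentence with fewer than n words contributes itself as a whole.
        counts[" ".join(words[:n])] += 1
    # most_common sorts by descending count, like ORDER BY cnt DESC LIMIT ...
    return counts.most_common(limit)
```

Calling `sentence_beginnings(sentences, n=2)` gives the two-word beginnings, which corresponds to replacing the `1` in both `substring_index` calls with `2`.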